Introduction to PyTorch: Why Tensors Matter
PyTorch is a flexible, dynamic open-source framework favored for deep learning research and rapid prototyping. At its core, the Tensor is the indispensable data structure: a multi-dimensional array designed to efficiently handle the numerical operations required by deep learning models, with built-in support for GPU acceleration.
1. Understanding Tensor Structure
Every input, output, and model parameter in PyTorch is encapsulated in a Tensor. They serve the same purpose as NumPy arrays but are optimized for processing on specialized hardware like GPUs, making them far more efficient for the large-scale linear algebra operations required by neural networks.
Key properties define the tensor:
- Shape: Defines the dimensions of the data, expressed as a tuple (e.g., $4 \times 32 \times 32$ for a batch of images).
- Dtype: Specifies the numeric type of elements stored (e.g., torch.float32 for model weights, torch.int64 for indexing).
- Device: Indicates the physical hardware location: typically 'cpu' or 'cuda' (NVIDIA GPU).
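As a quick sketch (the tensor name and values here are illustrative, not from the lesson), all three properties can be inspected directly on any tensor:

```python
import torch

# A batch-like tensor of random values, e.g. 4 images of 32x32
x = torch.rand(4, 32, 32)

print(x.shape)   # torch.Size([4, 32, 32])
print(x.dtype)   # torch.float32 (the default floating-point dtype)
print(x.device)  # cpu (tensors live on the CPU unless explicitly moved)
```

Moving a tensor to the GPU is explicit, e.g. `x.to('cuda')`, and only succeeds if CUDA is available.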
2. Dynamic Graph and Autograd
PyTorch uses an imperative execution model, meaning the computational graph is built as operations are executed. This enables the built-in automatic differentiation engine, Autograd, to track every operation on a Tensor, provided requires_grad=True is set on it, allowing gradients to be computed easily during backpropagation.
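A minimal sketch of this mechanism (the variable names are illustrative): setting requires_grad=True makes Autograd record each operation, and calling backward() computes the gradients.

```python
import torch

# Track operations on w so Autograd can compute gradients
w = torch.tensor([2.0, 3.0], requires_grad=True)

# y = w0^2 + w1^2; each operation is recorded in the dynamic graph
y = (w ** 2).sum()

# Backpropagation: dy/dw = 2 * w
y.backward()
print(w.grad)  # tensor([4., 6.])
```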
Question 1
Which command creates a $5 \times 5$ tensor containing random numbers following a uniform distribution between 0 and 1?
Question 2
If tensor $A$ is on the CPU, and tensor $B$ is on the CUDA device, what happens if you try to compute $A + B$?
Question 3
What is the most common data type (dtype) used for model weights and intermediate calculations in Deep Learning?
Challenge: Tensor Manipulation and Shape
Prepare a tensor for a specific matrix operation.
You have a feature vector $F$ of shape $(10,)$. You need to multiply it by a weight matrix $W$ of shape $(10, 5)$. For matrix multiplication (MatMul) to work, $F$ must be 2-dimensional.
Step 1
What should the shape of $F$ be before multiplication with $W$?
Solution:
The inner dimensions must match, so $F$ must be $(1, 10)$. Then $(1, 10) @ (10, 5) \rightarrow (1, 5)$.
Code:
F_new = F.unsqueeze(0) or F_new = F.view(1, -1)
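The step above can be sketched as a runnable snippet (using a random $F$ for illustration), showing that both unsqueeze and view produce the required $(1, 10)$ shape:

```python
import torch

F = torch.rand(10)       # feature vector, shape (10,)
F_new = F.unsqueeze(0)   # insert a leading dimension -> shape (1, 10)
print(F_new.shape)       # torch.Size([1, 10])

# Equivalent reshape: -1 tells PyTorch to infer the remaining dimension
assert F.view(1, -1).shape == F_new.shape
```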
Step 2
Perform the matrix multiplication between $F_{new}$ and $W$ (shape $(10, 5)$).
Solution:
The operation is straightforward MatMul.
Code:
output = F_new @ W or output = torch.matmul(F_new, W)
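As a sketch with random placeholder values, the multiplication and the resulting $(1, 5)$ shape look like this:

```python
import torch

F_new = torch.rand(1, 10)  # reshaped feature vector
W = torch.rand(10, 5)      # weight matrix

output = F_new @ W         # inner dims match: (1, 10) @ (10, 5) -> (1, 5)
print(output.shape)        # torch.Size([1, 5])

# torch.matmul is the functional equivalent of the @ operator
assert torch.equal(output, torch.matmul(F_new, W))
```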
Step 3
Which method explicitly returns a tensor with the specified dimensions, allowing you to flatten the tensor back to $(50,)$? (Assume $F$ was $(5, 10)$ initially and now needs to be flattened.)
Solution:
Use the view or reshape methods. The fastest way to flatten is often to pass -1 for one dimension.
Code:
F_flat = F.view(-1) or F_flat = F.reshape(50)
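A runnable sketch of the flattening step (again with a random placeholder $F$), showing that -1 lets PyTorch infer the flattened size:

```python
import torch

F = torch.rand(5, 10)   # original 2-D tensor

F_flat = F.view(-1)     # -1 infers the size: 5 * 10 = 50
print(F_flat.shape)     # torch.Size([50])

# reshape(50) is equivalent here; unlike view, reshape also
# handles non-contiguous tensors by copying when necessary
assert torch.equal(F_flat, F.reshape(50))
```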